Short-term memory traces for action bias in human reinforcement learning.

Authors

  • Rafal Bogacz
  • Samuel M McClure
  • Jian Li
  • Jonathan D Cohen
  • P Read Montague
Abstract

Recent experimental and theoretical work on reinforcement learning has shed light on the neural bases of learning from rewards and punishments. One fundamental problem in reinforcement learning is the credit assignment problem, or how to properly assign credit to actions that lead to reward or punishment following a delay. Temporal difference learning solves this problem, but its efficiency can be significantly improved by the addition of eligibility traces (ET). In essence, ETs function as decaying memories of previous choices that are used to scale synaptic weight changes. It has been shown in theoretical studies that ETs spanning a number of actions may improve the performance of reinforcement learning. However, it remains an open question whether including ETs that persist over sequences of actions allows reinforcement learning models to better fit empirical data regarding the behaviors of humans and other animals. Here, we report an experiment in which human subjects performed a sequential economic decision game in which the long-term optimal strategy differed from the strategy that leads to the greatest short-term return. We demonstrate that human subjects' performance in the task is significantly affected by the time between choices in a surprising and seemingly counterintuitive way. However, this behavior is naturally explained by a temporal difference learning model which includes ETs persisting across actions. Furthermore, we review recent findings that suggest that short-term synaptic plasticity in dopamine neurons may provide a realistic biophysical mechanism for producing ETs that persist on a timescale consistent with behavioral observations.
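The description of ETs as decaying memories of previous choices that scale weight changes can be illustrated with a minimal sketch. It is not the model fitted in the paper: it assumes a stateless setting with one value per action, a replacing trace, and a prediction error with no next-state term. The function name `update_with_traces` and the parameters `alpha` (learning rate) and `decay` (per-trial trace decay) are hypothetical names chosen for illustration.

```python
from typing import List, Sequence, Tuple


def update_with_traces(
    values: Sequence[float],
    traces: Sequence[float],
    action: int,
    reward: float,
    alpha: float = 0.1,
    decay: float = 0.5,
) -> Tuple[List[float], List[float]]:
    """One trial of a value update in which traces persist across actions.

    Illustrative assumption: a stateless task with one value per action.
    """
    # Decay every trace, then mark the action just taken (replacing trace).
    new_traces = [decay * e for e in traces]
    new_traces[action] = 1.0
    # Prediction error for the chosen action (no next-state term here).
    delta = reward - values[action]
    # Every action with a non-zero trace shares in the update.
    new_values = [v + alpha * delta * e for v, e in zip(values, new_traces)]
    return new_values, new_traces


# Usage: action 0 is chosen without reward, then action 1 is rewarded.
values, traces = [0.0, 0.0], [0.0, 0.0]
values, traces = update_with_traces(values, traces, action=0, reward=0.0)
values, traces = update_with_traces(values, traces, action=1, reward=1.0)
print(values, traces)
```

With `decay` above zero, the earlier choice (action 0) still holds a partial trace when the reward arrives, so its value rises as well. Setting `decay=0.0` confines the update to the most recent action, which is the case without traces persisting across actions.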

Similar resources

Dissociations within short-term memory in GluA1 AMPA receptor subunit knockout mice

GluA1 AMPA receptor subunit knockout mice display a selective impairment on short-term recognition memory tasks. In this study we tested whether GluA1 is important for short-term memory that is necessary for bridging the discontiguity between cues in trace conditioning. GluA1 knockout mice were not impaired at using short-term memory traces of T-maze floor inserts, made of different materials, ...

The effect of intrahippocampal microinjection of Naloxone on short-term and long-term memory in adult male rats

Introduction: The hippocampus is one of the major centers of learning and memory. The role of the opioid system has been investigated, and receptors related to this system, such as mu-opioid receptors (MOR), are distributed in the hippocampus. In this study the effect of Naloxone administration as a mu-opioid receptor antagonist on passive avoidance memory in adult male rats was i...

Bidding Strategy on Demand Side Using Eligibility Traces Algorithm

Restructuring in the power industry is followed by splitting different parts and creating competition between the purchasing and selling sections. As a consequence, through active participation in the energy market, the service provider companies and large consumers create a context for overcoming the problems resulting from a lack of demand side participation in the market. The most prominent ch...

Generating Text with Deep Reinforcement Learning

We introduce a novel schema for sequence to sequence learning with a Deep Q-Network (DQN), which decodes the output sequence iteratively. The aim here is to enable the decoder to first tackle easier portions of the sequences, and then turn to cope with difficult parts. Specifically, in each iteration, an encoder-decoder Long Short-Term Memory (LSTM) network is employed to, from the input sequenc...

The role of hippocampal nitric oxide in passive avoidance learning

Introduction: Nitric oxide (NO) is a retrograde messenger in hippocampal synaptic plasticity that is involved in learning and memory processes. Previous studies revealed that hippocampal pyramidal cells contain the NO synthase (NOS) enzyme, which produces NO and could be a promising target for evaluating the role of NO in brain cognitive functions. So in this study, using NOS inhibitor (L-NAME)...

Journal title:
  • Brain research

Volume 1153  Issue 

Pages  -

Publication date 2007